Language Resources for Semantic Document Annotation and Crosslingual Retrieval
Authors
Abstract
This paper describes how language resources interact to support adequate concept annotation of domain texts in several languages. The architecture includes a domain ontology, domain texts, language-specific lexicons, regular grammars, and disambiguation rules. This annotation is considered the preparatory phase for integrating a semantic search facility into Learning Management Systems. The implementation and performance of this search are discussed in the context of related work and of other types of search. Results from some preliminary steps towards evaluating the concept-based and text-based search are also presented.
Similar resources
A Systematic Evaluation of Concept-based Cross-Lingual Information Retrieval in the Medical Domain
The paper describes experiments and results of the MuchMore project, which is concerned with a systematic comparison of concept-based and corpus-based methods in cross-language information retrieval (CLIR) in the medical domain. Primary goals of the project are to develop and evaluate methods for the effective use of multilingual thesauri in the semantic annotation of English and German medica...
Integrated Language Technologies for Multilingual Information Services in the MEMPHIS Project
The MEMPHIS project integrates a large set of NLP technologies. An overview of components, their underlying technologies and resources will be presented: language identification, document classification, linguistic analysis, summarization, information extraction, machine translation, knowledge management and crosslingual retrieval.
A Cross Language Document Retrieval System Based on Semantic Annotation
The paper describes a cross-lingual document retrieval system in the medical domain that employs a controlled vocabulary (UMLS) in constructing an XML-based intermediary representation into which queries as well as documents are mapped. The system assists in the retrieval of English and German medical scientific abstracts relevant to a German query document (electronic patient record). The modul...
How to Add a New Language on the NLP Map: Building Resources and Tools for Languages with Scarce Resources
Those of us whose mother tongue is not English, or who are curious about applications involving other languages, often find ourselves in a situation where the tools we require are not available. According to recent studies there are about 7200 different languages spoken worldwide, not counting variations or dialects, of which very few have automatic language processing tools and machine...
SPIDER Retrieval System at TREC7
This year the Zurich team participated in two tracks: the automatic-adhoc track and the crosslingual track. For the adhoc task we focused on improving retrieval for short queries. We pursued two aims. First, we investigated weighting functions for short queries, explicitly without any kind of automatic query expansion. Second, we developed rules that automatically decide for which queries automa...
Publication date: 2008